Fast optimization of non-convex Machine Learning objectives
Abstract
In this project we examined the problem of non-convex optimization in the context of Machine Learning, drawing inspiration from the increasing popularity of methods such as Deep Belief Networks, which involve non-convex objectives. We focused on the task of training the Neural Autoregressive Distribution Estimator, a recently proposed variant of the Restricted Boltzmann Machine, in applications to density estimation. The aim of the project was to explore the various stages involved in implementing optimization methods and choosing the appropriate one for a given task. We examined a number of optimization methods, ranging from derivative-free to second-order and from batch to stochastic, experimenting with variations of these methods and presenting along the way all the major steps and decisions involved. The challenges of the problem included the relatively large parameter space, the non-convexity of the objective function, the large size of some of the datasets we used, the multitude of hyperparameters and decisions involved in each method, and the ever-present danger of overfitting the data. Our results show that second-order quasi-Newton batch methods such as L-BFGS, and variants of stochastic first-order methods such as Averaged Stochastic Gradient Descent, outperform the other methods we examined.
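As a rough illustration of the averaging idea behind Averaged Stochastic Gradient Descent, the following Python sketch runs plain SGD and returns a Polyak-Ruppert running average of the iterates. The function names, the least-squares toy problem, and the hyperparameters are our own illustrative assumptions, not details taken from the project.

```python
import numpy as np

def averaged_sgd(grad, w0, n_samples, n_epochs=5, lr=0.01, seed=0):
    """Plain SGD plus a Polyak-Ruppert running average of the iterates,
    which is returned as the final parameter estimate."""
    rng = np.random.default_rng(seed)
    w = w0.astype(float).copy()      # current SGD iterate
    w_bar = w.copy()                 # running average of all iterates
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n_samples):
            t += 1
            w = w - lr * grad(w, i)          # SGD step on one sample
            w_bar += (w - w_bar) / t         # incremental average update
    return w_bar

# Toy usage: noisy least-squares problem (purely illustrative).
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
true_w = np.arange(5.0)
y = X @ true_w + 0.1 * rng.normal(size=1000)
per_sample_grad = lambda w, i: (X[i] @ w - y[i]) * X[i]
print(averaged_sgd(per_sample_grad, np.zeros(5), n_samples=len(X)))
```

The averaged iterate is typically less noisy than the last SGD iterate, which is the practical appeal of the method in stochastic settings.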
Related works
Proximal Algorithms in Statistics and Machine Learning
In this paper we develop proximal methods for statistical learning. Proximal point algorithms are useful in statistics and machine learning for obtaining optimization solutions for composite functions. Our approach exploits closed-form solutions of proximal operators and envelope representations based on the Moreau, Forward-Backward, Douglas-Rachford and Half-Quadratic envelopes. Envelope repres...
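For context, the proximal operator of a function g maps a point v to the minimizer of g(x) + ||x - v||^2 / 2. One of the standard closed forms of this kind (a textbook example, not an implementation from the paper above) is soft-thresholding for the scaled L1 norm, sketched below in Python.

```python
import numpy as np

def prox_l1(v, lam):
    """Proximal operator of lam * ||x||_1, i.e. the minimizer of
    lam * ||x||_1 + 0.5 * ||x - v||^2: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

# Example: entries with magnitude below lam are shrunk to exactly zero.
print(prox_l1(np.array([-2.0, -0.3, 0.1, 1.5]), lam=0.5))
# expected (up to signed zeros): [-1.5, 0.0, 0.0, 1.0]
```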
New Optimisation Methods for Machine Learning
In this work we introduce several new optimisation methods for problems in machine learning. Our algorithms broadly fall into two categories: optimisation of finite sums and of graph structured objectives. The finite sum problem is simply the minimisation of objective functions that are naturally expressed as a summation over a large number of terms, where each term has a similar or identical w...
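The finite-sum setting referred to above is the generic form F(w) = (1/n) * sum_i f_i(w), where each term depends on one data point. A minimal sketch follows; the choice of logistic loss and all names are our own illustrative assumptions.

```python
import numpy as np

# Finite-sum objective F(w) = (1/n) * sum_i f_i(w), where each term f_i
# depends on a single data point; here f_i is a logistic loss (illustrative).
def f_i(w, x_i, y_i):
    return np.log1p(np.exp(-y_i * (x_i @ w)))

def F(w, X, y):
    return np.mean([f_i(w, X[i], y[i]) for i in range(len(y))])

# Example evaluation on small synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.where(X[:, 0] > 0, 1.0, -1.0)   # labels in {-1, +1}
print(F(np.zeros(3), X, y))            # equals log(2) at w = 0
```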
Fast Stochastic Variance Reduced ADMM for Stochastic Composition Optimization
We consider the stochastic composition optimization problem proposed in [17], which has applications ranging from estimation to statistical and machine learning. We propose the first ADMM-based algorithm named com-SVR-ADMM, and show that com-SVR-ADMM converges linearly for strongly convex and Lipschitz smooth objectives, and has a convergence rate of O(log S / S), which improves upon the O(S^(-4/9)) r...
Fast Randomized Algorithms for Convex Optimization and Statistical Estimation
Fast Global Convergence via Landscape of Empirical Loss
While optimizing convex objective (loss) functions has been a powerhouse for machine learning for at least two decades, non-convex loss functions have attracted fast-growing interest recently, due to many desirable properties such as superior robustness and classification accuracy compared with their convex counterparts. The main obstacle for non-convex estimators is that it is in general int...
Publication date: 2012